Turbo codes were recently proposed by Berrou, Glavieux, and Thitimajshima
[2], and it has been claimed these codes achieve near-Shannon-limit error correction
performance with relatively simple component codes and large interleavers. A required
$E_b/N_0$ of 0.7 dB was reported for a bit error rate of $10^{-5}$, using a rate 1/2
turbo code [2]. However, some important details that are necessary to reproduce
these results were omitted. This article confirms the accuracy of these claims, and
presents a complete description of an encoder/decoder pair that could be suitable
for deep-space applications, where lower rate codes can be used. We describe a new
simple method for trellis termination, analyze the effect of interleaver choice on the
weight distribution of the code, and introduce the use of unequal rate component
codes, which yields better performance.


I. Introduction 

Turbo codes were recently proposed by Berrou, Glavieux, and Thitimajshima [2] as a remarkable step 
forward in high-gain, low-complexity coding. It has been claimed these codes achieve near-Shannon-limit 
error correction performance with relatively simple component codes and large interleavers. A required 
$E_b/N_0$ of 0.7 dB was reported for a bit error rate (BER) of $10^{-5}$, using a rate 1/2 turbo code [2]. However,
some important details that are necessary to reproduce these results were omitted. The purpose of this 
article is to shed some light on the accuracy of these claims and to present a complete description of an 
encoder/decoder pair that could be suitable for deep-space applications, where lower rate codes can be 
used. Two new contributions are reported in this article: a new, simple method for trellis termination 
and the use of unequal component codes, which results in better performance. 


II. Parallel Concatenation of Convolutional Codes 

The codes considered in this article consist of the parallel concatenation of two convolutional codes 
with a random interleaver between the encoders. Figure 1 illustrates a particular example that will 
be used in this article to verify the performance of these codes. The encoder contains two recursive 
binary convolutional encoders, with $M_1$ and $M_2$ memory cells, respectively. In general, the two
component encoders need not be identical. The first component encoder operates directly on the
information bit sequence $u = (u_1, \ldots, u_N)$ of length $N$, producing the two output sequences
$x_{1i}$ and $x_{1p}$. The second component encoder operates on a reordered sequence of information
bits, $u'$, produced by an interleaver of length $N$, and outputs the two sequences $x_{2i}$ and $x_{2p}$.
The interleaver is a pseudorandom block scrambler defined by a permutation of $N$ elements with no
repetitions: a complete block is read into the interleaver and read out in a specified permuted order.
Figure 1 shows an example where a rate $r = 1/n = 1/4$ code is generated by two component codes with
$M_1 = M_2 = M = 4$, producing the outputs $x_{1i} = u$, $x_{1p} = u \cdot g_a/g_b$, $x_{2i} = u'$, and
$x_{2p} = u' \cdot g_a/g_b$, where the generator polynomials $g_a$ and $g_b$ have an octal representation
of 21 and 37, respectively. Note that various code rates can be obtained by puncturing the outputs.

Fig. 1. Example of an encoder.
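
The encoder of Fig. 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `rsc_parity` and `turbo_encode` are hypothetical, trellis termination is omitted, and the interleaver convention (output position j reads input position `perm[j]`) is assumed, since the text does not fix it. Octal 21 and 37 are read as the tap vectors 10001 and 11111; both are symmetric, so the bit-ordering convention does not matter here.

```python
def rsc_parity(u, ff=(1, 0, 0, 0, 1), fb=(1, 1, 1, 1, 1)):
    """Parity stream u * g_a / g_b of a recursive convolutional encoder.

    ff and fb list the coefficients of D^0..D^M for g_a (octal 21) and
    g_b (octal 37); fb[0] is assumed to be 1.
    """
    m = len(fb) - 1
    reg = [0] * m                      # reg[j] holds a_{k-1-j}
    parity = []
    for bit in u:
        a = bit
        for j in range(m):
            a ^= fb[j + 1] & reg[j]    # recursive (feedback) part, g_b
        p = ff[0] & a
        for j in range(m):
            p ^= ff[j + 1] & reg[j]    # feedforward part, g_a
        parity.append(p)
        reg = [a] + reg[:-1]           # shift the register
    return parity


def turbo_encode(u, perm):
    """Return the four streams (x_1i, x_1p, x_2i, x_2p) of the rate 1/4 code."""
    u = list(u)
    u_perm = [u[i] for i in perm]      # assumed convention for the interleaver
    return u, rsc_parity(u), u_perm, rsc_parity(u_perm)


# A single "1" produces a parity response that never dies out (period 5).
print(rsc_parity([1, 0, 0, 0, 0, 0, 0, 0, 0, 0]))
```

Because the encoder is recursive, the impulse response is infinite, which is the property the weight-distribution discussion below relies on.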

A. Trellis Termination 

We use the encoder in Fig. 1 to generate an $(n(N + M), N)$ block code. Since the component encoders
are recursive, it is not sufficient to set the last M information bits to zero in order to drive the encoder 
to the all-zero state, i.e., to terminate the trellis. The termination (tail) sequence depends on the state of 
each component encoder after N bits, which makes it impossible to terminate both component encoders 
with the same M bits. Fortunately, the simple stratagem illustrated in Fig. 2 is sufficient to terminate 
the trellis. Here the switch is in position “A” for the first N clock cycles and is in position “B” for M 
additional cycles, which will flush the encoders with zeros. The decoder does not assume knowledge of 
the M tail bits. The same termination method can be used for unequal rate and memory encoders. 

Fig. 2. Trellis termination.
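
A minimal sketch of the termination idea, assuming that position "B" of the switch feeds the feedback sum back as the encoder input, so that a zero enters the shift register on each of the M tail cycles. The function name `rsc_encode_terminated` is hypothetical, and the taps are those of the example code (octal 21 and 37).

```python
def rsc_encode_terminated(u, ff=(1, 0, 0, 0, 1), fb=(1, 1, 1, 1, 1)):
    """Encode N bits plus M tail bits; return (systematic, parity) lists."""
    m = len(fb) - 1
    reg = [0] * m
    sys_bits, par_bits = [], []
    for k in range(len(u) + m):
        feedback = 0
        for j in range(m):
            feedback ^= fb[j + 1] & reg[j]
        if k < len(u):
            bit = u[k]            # switch in position "A": information bit
        else:
            bit = feedback        # switch in position "B": cancel the feedback
        a = bit ^ feedback        # equals 0 during the tail
        p = ff[0] & a
        for j in range(m):
            p ^= ff[j + 1] & reg[j]
        sys_bits.append(bit)
        par_bits.append(p)
        reg = [a] + reg[:-1]
    return sys_bits, par_bits


# The tail depends on the encoder state after the N information bits:
print(rsc_encode_terminated([1])[0][1:])                 # tail after a lone "1"
print(rsc_encode_terminated([1, 0, 0, 0, 0, 1])[0][6:])  # tail when the state is already zero
```

The two printed tails differ, which illustrates why a single set of M bits cannot terminate both component encoders once the interleaver has reordered the input.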

B. Weight Distribution

In order to estimate the performance of a code, it is necessary to have information about its minimum 
distance, d, weight distribution, or actual code geometry, depending on the accuracy required for the 
bounds or approximations. The example turbo code shown in Fig. 1 produces two sets of codewords,
$x_1 = (x_{1i}, x_{1p})$ and $x_2 = (x_{2i}, x_{2p})$, whose weights can be easily computed. The challenge is in finding the
pairing of codewords from each set, induced by a particular interleaver. Intuitively, we would like to avoid 
pairing low-weight codewords from one encoder with low-weight words from the other encoder. Many such 
pairings can be avoided by proper design of the interleaver. However, if the encoders are not recursive, 
the low-weight codeword generated by the input sequence $u = (00 \cdots 0000100 \cdots 000)$ with a single “1”
will always appear again in the second encoder, for any choice of interleaver. This motivates the use of
recursive encoders, where the key ingredient is the recursiveness and not the fact that the encoders are
systematic. For our example using a recursive encoder, the input sequence $u = (00 \cdots 0010000100 \cdots 000)$
generates the minimum weight codeword (weight = 6). If the interleaver does not properly “break” this
input pattern, the resulting minimum distance will be 12.
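
The weight-6 figure can be checked with polynomial arithmetic over GF(2). The sketch below is illustrative only (the helper names `gf2_mul` and `gf2_divmod` are hypothetical); polynomials are stored as integers whose bit j is the coefficient of $D^j$. The input with two ones five positions apart is $1 + D^5$, which is divisible by $g_b$, so the parity sequence $u \cdot g_a/g_b$ is finite.

```python
def gf2_mul(a, b):
    """Carry-less product of two GF(2) polynomials stored as integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def gf2_divmod(a, b):
    """Quotient and remainder of a / b over GF(2); b must be nonzero."""
    q = 0
    while a.bit_length() >= b.bit_length():
        shift = a.bit_length() - b.bit_length()
        q ^= 1 << shift
        a ^= b << shift
    return q, a


G_A = 0o21                     # 1 + D^4
G_B = 0o37                     # 1 + D + D^2 + D^3 + D^4
u = 1 | (1 << 5)               # input pattern ...0010000100...

parity, remainder = gf2_divmod(gf2_mul(u, G_A), G_B)
weight = bin(u).count("1") + bin(parity).count("1")
print(remainder, weight)       # zero remainder: the parity sequence is finite

# A single "1" leaves a nonzero remainder, i.e. an infinite parity response.
print(gf2_divmod(G_A, G_B)[1])
```

If the interleaver maps this pattern onto itself (or onto a shifted copy), both component encoders contribute weight 6, giving the total of 12 quoted above.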


However, the minimum distance is not the most important quantity of the code, except for its
asymptotic performance at very high $E_b/N_0$. At moderate signal-to-noise ratios (SNRs), the weight
distribution at the first several possible weights is necessary to compute the code performance.
Estimating the complete weight distribution for a large $N$ is still an open problem for these codes.
We have investigated the effect of the interleaver on the weight distribution on a small-scale example
where $N = 16$. This yields an (80,16) code whose weight distribution can be found by exhaustive
enumeration. Some of our results are shown in Fig. 3(a), where it is apparent that a good choice of the
interleaver can increase the minimum distance from 12 to 14, and, more importantly, can reduce the
count of codewords at low weights. Figure 3(a) shows the weight distribution obtained by using no
interleaver, a reverse permutation, and a 4 × 4 block interleaver, all with $d = 12$. Better weight
distributions are obtained by the “random” permutation
{2, 13, 0, 3, 11, 15, 6, 14, 8, 9, 10, 4, 12, 1, 7, 5} with $d = 12$, and by the best-found permutation
{12, 3, 14, 15, 13, 11, 1, 5, 6, 0, 9, 7, 4, 2, 10, 8} with $d = 14$. For comparison, the binomial
distribution is also shown. The best known (80,16) linear block code has a minimum distance of 28. For
an interleaver length of $N = 1024$, we were only able to enumerate all codewords produced by input
sequences with weights 1, 2, and 3. This again confirmed the importance of the interleaver choice for
reducing the number of low-weight codewords. Better weight distributions were obtained by using
“random” permutations than by using structured permutations such as block or reverse permutations.
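
The exhaustive enumeration for $N = 16$ is small enough to sketch directly. The code below is an illustration under assumptions, not the procedure used for Fig. 3(a): the function names are hypothetical, the tail bits are generated by feeding back the feedback sum, and the interleaver convention (position j reads `perm[j]`) is assumed. Whether it reproduces the exact counts of Fig. 3(a) depends on conventions the text does not specify. Because the terminated code is linear, the sketch encodes only the 16 unit inputs and combines them by XOR in Gray-code order.

```python
from collections import Counter

FF = (1, 0, 0, 0, 1)   # g_a, octal 21
FB = (1, 1, 1, 1, 1)   # g_b, octal 37
M = 4


def rsc_terminated(u):
    """Systematic and parity bits (N + M each) of one terminated encoder."""
    reg, sys_bits, par_bits = [0] * M, [], []
    for k in range(len(u) + M):
        feedback = 0
        for j in range(M):
            feedback ^= FB[j + 1] & reg[j]
        bit = u[k] if k < len(u) else feedback   # tail bits cancel the feedback
        a = bit ^ feedback
        p = FF[0] & a
        for j in range(M):
            p ^= FF[j + 1] & reg[j]
        sys_bits.append(bit)
        par_bits.append(p)
        reg = [a] + reg[:-1]
    return sys_bits, par_bits


def codeword_as_int(u, perm):
    """Pack the 4(N + M) code bits of input u into one integer."""
    s1, p1 = rsc_terminated(u)
    s2, p2 = rsc_terminated([u[i] for i in perm])
    return int("".join(str(b) for b in s1 + p1 + s2 + p2), 2)


def weight_distribution(perm):
    """Count codewords of each Hamming weight over all 2^N inputs."""
    n = len(perm)
    basis = [codeword_as_int([int(i == j) for j in range(n)], perm) for i in range(n)]
    dist = Counter({0: 1})
    word = 0
    for g in range(1, 1 << n):
        # Consecutive Gray codes differ in the position of the lowest set bit of g.
        word ^= basis[(g & -g).bit_length() - 1]
        dist[bin(word).count("1")] += 1
    return dist


dist = weight_distribution(list(range(16)))        # no interleaver
d_min = min(w for w in dist if w > 0)
print(d_min, dist[d_min])
```

Passing one of the permutations listed above instead of `list(range(16))` gives the corresponding distribution under the same assumptions.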



Fig. 3. The (80, 16) code (a) weight distribution and (b) performance. 

For the (80,16) code using the best-found permutation, we have compared the performance of a
maximum-likelihood decoder (obtained by simulation) to that of a turbo decoder with 10 iterations, as
described in Section III, and to the union bound computed from the weight distribution, as shown in
Fig. 3(b). As expected, the performance of the turbo decoder is slightly suboptimum.


III. Turbo Decoding 

Let $u_k$ be a binary random variable taking values in $\{+1, -1\}$, representing the sequence of
information bits. The maximum a posteriori (MAP) algorithm, summarized in the Appendix, provides the log
likelihood ratio $L(k)$ given the received symbols $y$:

$$L(k) = \log \frac{P(u_k = +1 \mid y)}{P(u_k = -1 \mid y)} \tag{1}$$

The sign of $L(k)$ is an estimate, $\hat{u}_k$, of $u_k$, and the magnitude $|L(k)|$ is the reliability of this estimate, as
suggested in [3].


The channel model is shown in Fig. 4, where the $n_{1ik}$'s and the $n_{1pk}$'s are independent identically
distributed (i.i.d.) zero-mean Gaussian random variables with unit variance, and
$\rho = \sqrt{2E_s/N_0} = \sqrt{2rE_b/N_0}$ is the signal amplitude, so that $\rho^2$ is the SNR. A similar
model applies for encoder 2.


Fig. 4. The channel model: $y_{1i} = \rho u + n_{1i}$, $y_{1p} = \rho x_{1p} + n_{1p}$.
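
The channel model can be simulated in a few lines. This is a sketch under assumptions: the function names are hypothetical, and the symbols are taken to be already mapped to $\pm 1$. The helper `channel_llr` uses the fact that, for unit-variance Gaussian noise, the log-likelihood ratio of a single observation $y = \rho x + n$ is $2\rho y$, which is the additive term that appears later in the decoder discussion.

```python
import math
import random


def awgn_channel(symbols, ebn0_db, rate, rng):
    """Scale +/-1 symbols by rho and add unit-variance Gaussian noise."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    rho = math.sqrt(2.0 * rate * ebn0)          # rho = sqrt(2 r Eb/N0)
    received = [rho * x + rng.gauss(0.0, 1.0) for x in symbols]
    return rho, received


def channel_llr(rho, y):
    """log p(y | x = +1) / p(y | x = -1) for y = rho * x + n, n ~ N(0, 1)."""
    return 2.0 * rho * y


rng = random.Random(1)                           # fixed seed for repeatability
rho, y = awgn_channel([+1, -1, +1, +1], ebn0_db=0.7, rate=0.25, rng=rng)
print(rho, [channel_llr(rho, v) for v in y])
```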


Given the turbo code structure in Fig. 1, the optimum decoding rule maximizes either $P(u_k \mid y_1, y_2)$
(minimum bit-error probability rule) or $P(u \mid y_1, y_2)$ (maximum-likelihood sequence rule). Since this
rule is obviously too complex to compute, we resort to a suboptimum decoding rule [2,3] that
separately uses the two observations $y_1$ and $y_2$, as shown in Fig. 5. Each decoder in Fig. 5 computes
the a posteriori probabilities $P(u_k \mid y_i, \tilde{u}_i)$, $i = 1, 2$ [see Fig. 6(a)], or equivalently the
log-likelihood ratio $L_i(k) = \log [P(u_k = +1 \mid y_i, \tilde{u}_i) / P(u_k = -1 \mid y_i, \tilde{u}_i)]$, where $\tilde{u}_1$ is
provided by decoder 2 and $\tilde{u}_2$ is provided by decoder 1 [see Fig. 6(b)]. The quantities $\tilde{u}_i$ correspond
to “new data estimates,” “innovations,” or “extrinsic information” provided by decoders 1 and 2, which
can be used to generate a priori probabilities on the information sequence $u$ for branch metric
computation in each decoder.


The question is how to generate the probabilities $P(\tilde{u}_{i,k} \mid u_k)$ that should be used for
computation of the branch transition probabilities in MAP decoding. It can be shown that the probabilities
$P(u_k \mid \tilde{u}_{i,k})$ or, equivalently, $\log [P(u_k = +1 \mid \tilde{u}_{i,k}) / P(u_k = -1 \mid \tilde{u}_{i,k})]$, $i = 1, 2$, can be used
instead of $P(\tilde{u}_{i,k} \mid u_k)$ for branch metric computations in the decoders. When decoder 1 generates
$P(u_k \mid \tilde{u}_{2,k})$ or $\log [P(u_k = +1 \mid \tilde{u}_{2,k}) / P(u_k = -1 \mid \tilde{u}_{2,k})]$ for decoder 2, this quantity should
not include the contribution due to $\tilde{u}_{1,k}$, which has already been generated by decoder 2. Thus, we
should have

$$\log \frac{P(u_k = +1 \mid \tilde{u}_{2,k})}{P(u_k = -1 \mid \tilde{u}_{2,k})} = \log \frac{P(u_k = +1 \mid y_1, \tilde{u}_{1,1}, \ldots, \tilde{u}_{1,k-1}, \tilde{u}_{1,k+1}, \ldots, \tilde{u}_{1,N})}{P(u_k = -1 \mid y_1, \tilde{u}_{1,1}, \ldots, \tilde{u}_{1,k-1}, \tilde{u}_{1,k+1}, \ldots, \tilde{u}_{1,N})} \tag{2}$$




Fig. 5. The turbo decoder. 

Fig. 6. Input/output of the MAP decoder: (a) a posteriori probability and (b) log-likelihood ratio.


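To compute $\log [P(u_k = +1 \mid \tilde{u}_{2,k}) / P(u_k = -1 \mid \tilde{u}_{2,k})]$, we note [see Fig. 6(a)] that

$$P(u_k \mid y_1, \tilde{u}_1) = \frac{P(u_k \mid y_1, \tilde{u}_{1,1}, \ldots, \tilde{u}_{1,k-1}, \tilde{u}_{1,k+1}, \ldots, \tilde{u}_{1,N}) \, P(\tilde{u}_{1,k} \mid u_k, y_1, \tilde{u}_{1,1}, \ldots, \tilde{u}_{1,k-1}, \tilde{u}_{1,k+1}, \ldots, \tilde{u}_{1,N})}{P(\tilde{u}_{1,k} \mid y_1, \tilde{u}_{1,1}, \ldots, \tilde{u}_{1,k-1}, \tilde{u}_{1,k+1}, \ldots, \tilde{u}_{1,N})} \tag{3}$$

Since $\tilde{u}_{1,k}$ was generated by decoder 2 and deinterleaving is used, this quantity depends only weakly on
$y_1$ and $\tilde{u}_{1,j}$, $j \neq k$. Thus, we can have the following approximation (the factor 2 follows from
Bayes' rule with $P(u_k) = 1/2$):

$$P(\tilde{u}_{1,k} \mid u_k, y_1, \tilde{u}_{1,1}, \ldots, \tilde{u}_{1,k-1}, \tilde{u}_{1,k+1}, \ldots, \tilde{u}_{1,N}) \approx P(\tilde{u}_{1,k} \mid u_k) = 2 P(u_k \mid \tilde{u}_{1,k}) P(\tilde{u}_{1,k}) \tag{4}$$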


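Using Eq. (4) in Eq. (3), we obtain

$$P(u_k \mid y_1, \tilde{u}_{1,1}, \ldots, \tilde{u}_{1,k-1}, \tilde{u}_{1,k+1}, \ldots, \tilde{u}_{1,N}) = \frac{P(u_k \mid y_1, \tilde{u}_1) \, P(\tilde{u}_{1,k} \mid y_1, \tilde{u}_{1,1}, \ldots, \tilde{u}_{1,k-1}, \tilde{u}_{1,k+1}, \ldots, \tilde{u}_{1,N})}{2 P(u_k \mid \tilde{u}_{1,k}) P(\tilde{u}_{1,k})} \tag{5}$$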


It is preferable to work with likelihood ratios to avoid computing probabilities not involving $u_k$ [see
Fig. 6(b)]. Define

$$\tilde{L}_i(k) = \log \frac{P(u_k = +1 \mid \tilde{u}_{i,k})}{P(u_k = -1 \mid \tilde{u}_{i,k})}, \quad i = 1, 2 \tag{6}$$


From Eqs. (2) and (5), we obtain $\tilde{L}_2^{(m)}(k) = L_1^{(m)}(k) - \tilde{L}_1^{(m-1)}(k)$ at the output of decoder 1, before
interleaving, for the $m$th iteration. Similarly, we can obtain $\tilde{L}_1^{(m)}(k) = L_2^{(m)}(k) - \tilde{L}_2^{(m)}(k)$ at the output
of decoder 2, after deinterleaving. Using the above definitions, the a priori probabilities can be computed
as

$$P(u_k = +1 \mid \tilde{u}_{i,k}) = \frac{e^{\tilde{L}_i(k)}}{1 + e^{\tilde{L}_i(k)}} = 1 - P(u_k = -1 \mid \tilde{u}_{i,k}), \quad i = 1, 2 \tag{7}$$

Then the update equation for the $m$th iteration of the decoder in Fig. 5 becomes

$$\tilde{L}_1^{(m)}(k) = \tilde{L}_1^{(m-1)}(k) + \alpha_m \left[ L_2^{(m)}(k) - L_1^{(m)}(k) \right], \quad \alpha_m = 1 \tag{8}$$

This looks like the update equation of a steepest descent method, where $[L_2^{(m)}(k) - L_1^{(m)}(k)]$
represents the rate of change of $L(k)$ for a given $u_k$, and $\alpha_m$ is the step size.
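
The two computations above are simple enough to illustrate numerically. The sketch below is not a full turbo decoder: the MAP outputs are replaced by made-up numbers chosen only to exercise the arithmetic, and the function names are hypothetical. It converts an extrinsic log-likelihood ratio into the a priori probability of Eq. (7) and checks that the one-line update of Eq. (8) agrees with the two subtraction steps that precede it.

```python
import math


def llr_to_prob(llr):
    """P(u_k = +1) from Eq. (7), written to avoid overflow for large |llr|."""
    if llr >= 0:
        return 1.0 / (1.0 + math.exp(-llr))
    e = math.exp(llr)
    return e / (1.0 + e)


def extrinsic_update(prev_ext1, llr1, llr2, step=1.0):
    """Eq. (8): new extrinsic values for decoder 1 from both decoder outputs."""
    return [p + step * (l2 - l1) for p, l1, l2 in zip(prev_ext1, llr1, llr2)]


# Illustrative numbers only (not produced by a real MAP decoder).
prev_ext1 = [0.5, -1.0]          # extrinsic input to decoder 1, iteration m - 1
llr1 = [2.0, -3.0]               # decoder 1 output L_1, iteration m
llr2 = [3.5, -2.0]               # decoder 2 output L_2, iteration m

ext2 = [l1 - e for l1, e in zip(llr1, prev_ext1)]     # passed to decoder 2
ext1_two_step = [l2 - e for l2, e in zip(llr2, ext2)]  # passed back to decoder 1
ext1_eq8 = extrinsic_update(prev_ext1, llr1, llr2)

print(ext1_two_step, ext1_eq8)
print([llr_to_prob(v) for v in ext1_eq8])
```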


Figure 7 shows the probability density function of $\tilde{L}_1(k)$ at the output of the second decoder in Fig. 1,
after deinterleaving and given $u_k = +1$. As shown in Fig. 7, this density function shifts to the right as
the number of iterations, m, increases. The area under each density function to the left of the origin 
represents the BER if decoding stops after m iterations. 


At this point, certain observations can be made. Note that $\tilde{L}_2(k)$ at the input of decoder 2 includes
an additive component $2\rho y_{1ik}$, which contributes to the branch metric computations in decoder 2 at
observation $y_{2ik}$. This improves by 3 dB the SNR of the noisy information symbols at the input of
decoder 2. Similar arguments hold for $\tilde{L}_1(k)$. An apparently more powerful decoding structure can be
considered, as shown in Fig. 8. However, the performances of the decoding structures in Figs. 8 and 5 are
equivalent for a large number of iterations (the actual difference is one-half iteration). If the structure in
Fig. 8 is used, then the log-likelihood ratio $\tilde{L}_2(k)$ fed to decoder 2 should not depend on $\tilde{u}_{1k}$ and $y'_{1ik}$,
and, similarly, $\tilde{L}_1(k)$ should not depend on $\tilde{u}_{2k}$ and $y'_{2ik}$. Using analogous derivations based on Eqs. (2)
through (5), we obtain

$$\tilde{L}_2(k) = L_1(k) - \tilde{L}_1(k) - 2\rho y'_{1ik}$$

$$\tilde{L}_1(k) = L_2(k) - \tilde{L}_2(k) - 2\rho y'_{2ik}$$

where $y'_{1i}$ is the sum of $y_{1i}$ with the deinterleaved version of $y_{2i}$ and $y'_{2i}$ is the sum of $y_{2i}$ with the
interleaved version of $y_{1i}$. Thus, the net effect of the decoding structure in Fig. 8 is to explicitly pass to
decoder 2 the information contained in $y_{1i}$ (and vice versa), but to remove the identical term from the
input log-likelihood ratio.

Fig. 7. The reliability function.

Fig. 8. An equivalent turbo decoder.





IV. Performance 

The performance obtained by turbo decoding the code in Fig. 1 with random permutations of lengths
$N = 4096$ and $N = 16384$ is compared in Fig. 9 to the capacity of a binary-input Gaussian channel for
rate $r = 1/4$ and to the performance of a (15,1/4) convolutional code originally developed at JPL for
the Galileo mission. At BER = $5 \times 10^{-3}$, the turbo code is better than the (15,1/4) code by 0.25 dB for
$N = 4096$ and by 0.4 dB for $N = 16384$.



Fig. 9. Turbo codes performance, $r = 1/4$ (horizontal axis: $E_b/N_0$, dB).

So far we have considered only component codes with identical rates, as shown in Fig. 1. Now we
propose to extend the results to encoders with unequal rates, as shown in Fig. 10. This structure improves
the performance of the overall rate 1/4 code, as shown in Fig. 9. The gains at BER = $5 \times 10^{-3}$ relative to
the (15,1/4) code are 0.55 dB for $N = 4096$ and 0.7 dB for $N = 16384$. For both cases, the performance
is within 1 dB of the Shannon limit at BER = $5 \times 10^{-3}$, and the gap narrows to 0.7 dB for $N = 16384$
at a low BER.
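
For orientation, a rough reference point for the Shannon limit at a given rate can be computed in closed form. The sketch below is a swapped-in simplification, not the curve used in Fig. 9: it evaluates the limit for an unconstrained-input Gaussian channel, $E_b/N_0 = (2^{2r} - 1)/(2r)$ with $r$ in bits per real channel use, whereas the article compares against the binary-input Gaussian channel, whose limit is slightly higher and requires numerical integration. The function name is hypothetical.

```python
import math


def unconstrained_shannon_limit_db(rate):
    """Minimum Eb/N0 (dB) for reliable transmission at `rate` bits per real
    channel use over an AWGN channel with unconstrained (Gaussian) inputs."""
    ebn0 = (2.0 ** (2.0 * rate) - 1.0) / (2.0 * rate)
    return 10.0 * math.log10(ebn0)


for r in (0.5, 0.25):
    print(r, round(unconstrained_shannon_limit_db(r), 3))
```

For $r = 1/4$ this gives roughly $-0.82$ dB, which is a lower bound on the binary-input limit referred to above.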


V. Conclusions 

We have shown how turbo codes and decoders can be used to improve the coding gain for deep-space 
communications while decreasing the decoding complexity with respect to the large constraint-length 
convolutional codes currently in use. These are just preliminary results that require extensive further 
analysis. In particular, we need to improve our understanding of the influence of the interleaver choice 
on the code performance, to explore the sensitivity of the decoder performance to the precision with 
which we can estimate $E_b/N_0$, and to establish whether there might be a flattening of the performance
curves at higher $E_b/N_0$, as appears in one of the curves in Fig. 9. An interesting theoretical question
is to determine how random these codes can be so as to draw conclusions on their performance based on 
comparison with random coding bounds. 

In this article, we have explored turbo codes using only two encoders, but similar constructions can
be used to build multiple-encoder turbo codes and generalize the turbo decoding concept to a truly 
distributed decoding system where each subdecoder works on a piece of the total observation and tentative 
estimates are shared among decoders until an acceptable degree of consensus is reached. 


Fig. 10. The two-rate encoder.


Appendix

The MAP Algorithm 


Let $u_k$ be the information bit associated with the transition from time $k - 1$ to time $k$, and use $s$ as
an index for the states. The MAP algorithm [1,4] provides the log likelihood given the received symbols
$y_k$, as shown in Fig. A-1.


Fig. A-1. The MAP algorithm.


$$L(k) = \log \frac{P(u_k = +1 \mid y)}{P(u_k = -1 \mid y)} = \log \frac{\sum_s \sum_{s'} \gamma_{+1}(y_k, s', s) \, \alpha_{k-1}(s') \, \beta_k(s)}{\sum_s \sum_{s'} \gamma_{-1}(y_k, s', s) \, \alpha_{k-1}(s') \, \beta_k(s)} \tag{A-1}$$


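The estimate of the transmitted bits is then given by $\mathrm{sign}[L(k)]$ and their reliability by $|L(k)|$. In order
to compute Eq. (A-1), we need the forward and backward recursions,

$$\alpha_k(s) = \frac{\sum_{s'} \sum_{i = \pm 1} \gamma_i(y_k, s', s) \, \alpha_{k-1}(s')}{\sum_s \sum_{s'} \sum_{j = \pm 1} \gamma_j(y_k, s', s) \, \alpha_{k-1}(s')}$$

$$\beta_k(s) = \frac{\sum_{s'} \sum_{i = \pm 1} \gamma_i(y_{k+1}, s, s') \, \beta_{k+1}(s')}{\sum_s \sum_{s'} \sum_{j = \pm 1} \gamma_j(y_{k+1}, s', s) \, \alpha_k(s')} \tag{A-2}$$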


where

$$\gamma_i(y_k, s', s) = \begin{cases} \eta_k \, e^{\rho \sum_{\nu = 1}^{n} y_{k,\nu} x_{k,\nu}} & \text{if transition } s' \to s \text{ is allowable for } u_k = i \\ 0 & \text{otherwise} \end{cases} \tag{A-3}$$

$\rho = \sqrt{2(E_s/N_0)}$, $\eta_k = P(u_k = \pm 1 \mid \tilde{u}_k)$, except for the first iteration in the first decoder, where $\eta_k = 1/2$,
and $x_{k,\nu}$ are code symbols. The operation of these recursions is shown in Fig. A-2. The evaluation of
Eq. (A-1) can be organized as follows:


Step 0: $\alpha_0(0) = 1$, $\alpha_0(s) = 0$, $\forall s \neq 0$;
        $\beta_N(0) = 1$, $\beta_N(s) = 0$, $\forall s \neq 0$.

Step 1: Compute the $\gamma_k$'s using Eq. (A-3) for each received set of symbols $y_k$.

Step 2: Compute the $\alpha_k$'s using Eq. (A-2) for $k = 1, \ldots, N$.

Step 3: Use Eq. (A-2) and the results of Steps 1 and 2 to compute the $\beta_k$'s for $k = N, \ldots, 1$.

Step 4: Compute $L(k)$ using Eq. (A-1) for $k = 1, \ldots, N$.
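
Steps 0 through 4 can be sketched for a single terminated component code as follows. This is an illustration under several assumptions rather than the authors' decoder: all function names are hypothetical; bit 1 is mapped to $+1$ and bit 0 to $-1$; the branch metric is taken as $\eta_k \exp(\rho \sum_\nu y_{k,\nu} x_{k,\nu})$ with one systematic and one parity symbol per branch; and each $\beta_k$ is normalized by its own sum instead of the denominator of Eq. (A-2), which changes only a per-step constant that cancels in Eq. (A-1).

```python
import math


def build_trellis(ff, fb):
    """Tables nxt[s][b], par[s][b]; ff and fb list the D^0..D^M taps (fb[0] = 1)."""
    m = len(fb) - 1
    nxt, par = [], []
    for s in range(1 << m):
        reg = [(s >> j) & 1 for j in range(m)]      # reg[j] holds a_{k-1-j}
        row_n, row_p = [], []
        for b in (0, 1):
            a = b
            for j in range(m):
                a ^= fb[j + 1] & reg[j]
            p = ff[0] & a
            for j in range(m):
                p ^= ff[j + 1] & reg[j]
            new = [a] + reg[:-1]
            row_n.append(sum(bit << j for j, bit in enumerate(new)))
            row_p.append(p)
        nxt.append(row_n)
        par.append(row_p)
    return nxt, par


def encode_terminated(bits, nxt, par, m):
    """Systematic and parity bits for N information bits plus m tail bits."""
    s, sys_out, par_out = 0, [], []
    for k in range(len(bits) + m):
        if k < len(bits):
            b = bits[k]
        else:
            b = 0 if nxt[s][0] & 1 == 0 else 1      # tail bit that shifts in a zero
        sys_out.append(b)
        par_out.append(par[s][b])
        s = nxt[s][b]
    return sys_out, par_out


def map_decode(y_sys, y_par, rho, nxt, par, prior_llr=None):
    """Return L(k) for every trellis step, following Steps 0-4."""
    K, S = len(y_sys), len(nxt)
    prior_llr = prior_llr or [0.0] * K              # eta_k = 1/2 without a prior
    pm = lambda b: 1.0 if b else -1.0               # assumed bit-to-symbol mapping

    def gamma(k, s, b):                             # Step 1, Eq. (A-3)
        p_plus = 1.0 / (1.0 + math.exp(-prior_llr[k]))
        eta = p_plus if b else 1.0 - p_plus
        return eta * math.exp(rho * (y_sys[k] * pm(b) + y_par[k] * pm(par[s][b])))

    alpha = [[0.0] * S for _ in range(K + 1)]       # Step 0
    beta = [[0.0] * S for _ in range(K + 1)]
    alpha[0][0] = beta[K][0] = 1.0
    for k in range(K):                              # Step 2: forward recursion
        for s in range(S):
            for b in (0, 1):
                alpha[k + 1][nxt[s][b]] += alpha[k][s] * gamma(k, s, b)
        total = sum(alpha[k + 1])
        alpha[k + 1] = [a / total for a in alpha[k + 1]]
    for k in range(K - 1, -1, -1):                  # Step 3: backward recursion
        for s in range(S):
            beta[k][s] = sum(gamma(k, s, b) * beta[k + 1][nxt[s][b]] for b in (0, 1))
        total = sum(beta[k])
        beta[k] = [x / total for x in beta[k]]
    llr = []
    for k in range(K):                              # Step 4, Eq. (A-1)
        num = sum(alpha[k][s] * gamma(k, s, 1) * beta[k + 1][nxt[s][1]] for s in range(S))
        den = sum(alpha[k][s] * gamma(k, s, 0) * beta[k + 1][nxt[s][0]] for s in range(S))
        llr.append(math.log(num / den))
    return llr


nxt, par = build_trellis([1, 0, 0, 0, 1], [1, 1, 1, 1, 1])
bits = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0]
xs, xp = encode_terminated(bits, nxt, par, 4)
rho = 2.0
ys = [rho * (1.0 if b else -1.0) for b in xs]       # noiseless observations
yp = [rho * (1.0 if b else -1.0) for b in xp]
decoded = [1 if v > 0 else 0 for v in map_decode(ys, yp, rho, nxt, par)[:len(bits)]]
print(decoded == bits)
```

In a turbo decoder, `prior_llr` would carry the extrinsic values supplied by the other decoder; here it is left at its default so that $\eta_k = 1/2$, as in the first iteration.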



Fig. A-2. Forward and backward recursions. 